Efficient pyramid context encoding and feature embedding for semantic segmentation

نویسندگان

چکیده

For reality applications of semantic segmentation, inference speed and memory usage are two important factors. To address these challenges, we propose a lightweight feature pyramid encoding network (FPENet) for segmentation with good trade-off between accuracy speed. We use series (FPE) blocks to encode context at multiple scales in the encoder. Each FPE block consists different depthwise dilated convolutions that perform as spatial extract features reduce computational costs. During training, one-shot neural architecture search algorithm is adopted find optimal structure each from large space small cost. After encoder, mutual embedding upsample module introduced decoder, consisting attention blocks. The encoder-decoder mechanism used help aggregate efficiently high-level low-level details. proposed outperforms existing real-time methods fewer parameters improved on Cityscapes CamVid benchmark datasets. Specifically, it achieved 72.3% mean IoU test set only 0.4 M 192.6 FPS an Nvidia Titan V100 GPU, 73.4% 116.2 when running higher resolution images.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Context Encoding for Semantic Segmentation

Recent work has made significant progress in improving spatial resolution for pixelwise labeling with Fully Convolutional Network (FCN) framework by employing Dilated/Atrous convolution, utilizing multi-scale features and refining boundaries. In this paper, we explore the impact of global contextual information in semantic segmentation by introducing the Context Encoding Module, which captures ...

متن کامل

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints. ESPNet is based on a new convolutional module, efficient spatial pyramid (ESP), which is efficient in terms of computation, memory, and power. ESPNet is 22 times faster (on a standard GPU) and 180 times smaller than the stateof-the-art semantic ...

متن کامل

Volumetric Semantic Segmentation using Pyramid Context Features Supplemental Material

Our “pyramid filtering” insight enables exact and extremely efficient per-voxel classification with minimal memory overhead. We will now demonstrate our improvement over existing techniques empirically and theoretically. For an empirical demonstration of efficiency, consider the following two alternatives to pyramid filtering: 1) the “sliding window” approach: iterate through every voxel in the...

متن کامل

Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation

CNN architectures have terrific recognition performance but rely on spatial pooling which makes it difficult to adapt them to tasks that require dense, pixel-accurate labeling. This paper makes two contributions: (1) We demonstrate that while the apparent spatial resolution of convolutional feature maps is low, the high-dimensional feature representation contains significant sub-pixel localizat...

متن کامل

Context Tricks for Cheap Semantic Segmentation

Accurate semantic labeling of image pixels is difficult because intra-class variability is often greater than inter-class variability. In turn, fast semantic segmentation is hard because accurate models are usually too complicated to also run quickly at test-time. Our experience with building and running semantic segmentation systems has also shown a reasonably obvious bottleneck on model compl...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Image and Vision Computing

سال: 2021

ISSN: ['0262-8856', '1872-8138']

DOI: https://doi.org/10.1016/j.imavis.2021.104195